ORIGINAL ARTICLE

Transcriptome profiling of human hippocampus dentate gyrus 
granule cells in mental illness 

R Kohen, A Dobra, JH Tracy and E Haugen

This study is, to the best of our knowledge, the first application of whole transcriptome sequencing (RNA-seq) to cells isolated from 
postmortem human brain by laser capture microdissection. We investigated the transcriptome of dentate gyrus (DG) granule cells 
in postmortem human hippocampus in 79 subjects with mental illness (schizophrenia, bipolar disorder, major depression) and 
nonpsychiatric controls. We show that the choice of normalization approach for analysis of RNA-seq data had a strong effect on 
results; under our experimental conditions a nonstandard normalization method gave superior results. We found evidence of 
disrupted signaling by miR-182 in mental illness. This was confirmed using a novel method of leveraging microRNA genetic variant 
information to indicate active targeting. In healthy subjects and those with bipolar disorder, carriers of a high-expressing genotype
of miR-182 had different levels of miR-182 target gene expression than carriers of a low-expressing genotype, indicating an active role of miR-182 in
shaping the DG transcriptome for those subject groups. By contrast, comparing the transcriptome between carriers of different 
genotypes among subjects with major depression and schizophrenia suggested a loss of DG miR-182 signaling in these conditions. 
INTRODUCTION

Schizophrenia, bipolar disorder and major depression are common
and severely disabling psychiatric conditions with a partially
genetic background. Family studies have shown co-aggregation of
the major psychiatric disorders, and population-based studies have
indicated shared genetic susceptibility loci. Further evidence for
common etiological factors comes from similarities in gene
expression changes observed in different diseases, which have
implicated deficits in neurotransmission and mitochondrial
function, elevated immune response and inflammation, and
downregulation of genes expressed in oligodendrocytes. The goal of
our study was to find common etiological mechanisms for these
diseases through the identification of shared transcriptome
changes.

One of the brain regions most consistently implicated in mental
illness is the hippocampus, which is involved in memory,
cognition, mood regulation and stress response. In subjects with
schizophrenia, bipolar disorder or major depression, abnormalities
in hippocampus structure or function as well as a broad range of
gene expression changes have been described. The great
majority of prior studies have been done in frontal cortex, but
hippocampus has been repeatedly investigated as well.
However, interpreting changes in the hippocampal transcriptome
is fraught with difficulty because of the different tasks performed
by different hippocampal subregions. Consequently, the areas
CA1, CA3 and the dentate gyrus (DG) show large differences in
gene expression; additional variability is introduced by functional
differentiation along the long axis of the hippocampus. The
DG is of particular interest as it is one of only two brain regions
where adult neurogenesis has been described. A large body of
literature has linked hippocampal neurogenesis with psychiatric
illness, including affective disorders and schizophrenia. We
therefore chose to investigate the transcriptome of DG granule
cells, isolating them from the surrounding tissue and harvesting
them by laser capture microdissection (LCM). We believe our
study is the first to combine LCM with RNA-seq.



MATERIALS AND METHODS 

Human subjects 

Postmortem mid-hippocampus tissue from 79 individuals was obtained from the
Stanley Medical Research Institute (SMRI) Neuropathology Consortium, the
UCLA Human Brain and Spinal Fluid Resource Center, and the University of
Washington (UW) Neuropathology Core Brain bank. Most (n = 60) subjects were from the
SMRI Neuropathology Consortium, a well-described brain collection which
has been extensively used in neuropsychiatric research. The collection
consists of four groups of 15 subjects each with schizophrenia, bipolar 
disorder, major depression and nonpsychiatric controls. Groups are 
matched by gender with nine males and six females per group, and by 
age, ranging from 25-68 years. Ten subjects, four males and five females 
ranging in age from 44-91 years old, were from the UCLA Human Brain 
and Spinal Fluid Resource Center. Of these, two carried a diagnosis of 
schizophrenia, one of bipolar disorder, two had suffered from major 
depression and five were nonpsychiatric controls. An additional nine 
nonpsychiatric controls were from the UW Neuropathology Core Brain 
bank. These subjects, five males and four females, ranged in age from 
78-91 years. 

The gender distribution (41-47% female) did not vary significantly by 
disease group (schizophrenia, bipolar disorder, major depression and 
nonpsychiatric controls). The mean age of our subjects did vary 
significantly by group (P<10~'^), however, because our nonpsychiatric 
control subjects were on average older at death (64 ± 20 years) than
members of the three disease groups, schizophrenia (47 ± 17 years),
bipolar disorder (42 ± 16 years) and major depression (47 ± 17 years). We
therefore evaluated the possible confounding influence of age by 
comparing DG transcriptomes between the seven youngest (age 19-44,
mean 39 ± 5.7 years) and the six oldest (age 90-95, mean 92 ± 1.9 years)
members of the healthy subject group. We identified four genes whose
levels of expression appeared to be influenced by age (MAGI2, RASGRF1,
USP24 and NUP107). However, none of these genes were identified in any
of our disease group comparisons, indicating that our results were not 
influenced by the age difference between psychiatric subjects and 
controls. 

This study was approved by the Institutional Review Board of the 
University of Washington and conducted in accordance with ethics 
guidelines for the use of human subjects in research. 



Laboratory methods 

Fresh frozen 14 µm slide-mounted coronal cryostat sections from mid-
hippocampus were stained and dehydrated using the Arcturus HistoGene
LCM frozen section staining kit (Life Technologies, Grand Island, NY, USA),
following the manufacturer's instructions. From each subject, triplicate
samples of about 2000 DG granule cells each were harvested by LCM, 
using an Arcturus AutoPix LCM system and CapSure Macro LCM caps 
(Molecular Devices, Sunnyvale, CA, USA). Triplicates were processed 
separately during cell harvest, RNA extraction and aRNA amplification to 
reduce experimental noise introduced during these stages of the 
experiment. Harvested cells were removed from the caps and RNA 
extracted using PicoPure RNA isolation kits (Life Technologies). RNA was 
then linearly amplified over two rounds of aRNA amplification, using
MessageAmp II aRNA amplification kits (Life Technologies) and following
the manufacturer's protocol. The quality and concentration of aRNA were
checked by spectrophotometry, and only samples with an A260/280 ratio
>1.9 were used. Equimolar amounts of triplicate aRNA samples for each of
the 79 subjects were then pooled for the preparation of sequencing
libraries. To evaluate different normalization/scaling methods, a separate 
test data set was prepared. In this data set, duplicate aRNA samples 
(denoted A and B) from four randomly chosen subjects (T1-T4) were used 
to construct a set of eight libraries. 

Sequencing libraries were prepared using the Total RNA Sequencing Kit (Life
Technologies), following the directions for Whole Transcriptome Libraries,
and analyzed with an Applied Biosystems SOLiD 4 high-throughput
sequencer (Life Technologies) with an average single-end read length of
50 base pairs (bp).

The genetic polymorphism rs76481776 of miR-182 was genotyped using
a StepOnePlus Real-Time PCR System and a TaqMan Custom SNP
Genotyping Assay (Life Technologies). A total of 50 ng of genomic DNA
was amplified in the presence of gene-specific primers and allele-specific
fluorescent probes following the manufacturer's instructions. Genotypes
were called using TaqMan Genotyper software. For quality control, 10% of
the samples were genotyped in duplicate, and the genotype distribution
was tested for deviation from Hardy-Weinberg equilibrium using a χ² test.



Data analysis 

Reads were mapped and counted using the Applied Biosystems software 
BioScope 1.2.1. Transcripts were mapped to genome build GRCh37/hg19 
(February 2009 assembly), using the UCSC RefGene annotations and the 
BioScope default seed-and-extend approach for mapping. Reads were 
mapped to the whole genome file, supplemented by a transcript 
annotation file allowing reads to align across known splice junctions with 
no gap penalty. Repeat and ambiguously mapped sequences were then 
removed from the counts files using a UCSC RepeatMasker file. We then 
used the BEDTools bamToBed program to generate files of read 
locations. We only counted reads mapping to the opposite strand.

Table 1. Clustering of samples according to different normalization/scaling strategies
Shown are cluster memberships for technical replicates (A and B) of four randomly chosen samples (T1-T4). The principles guiding the different normalization
strategies with respect to transcript or exon length are shown in column 2. Column 3 delineates the choice of two different scaling methods: R, counts are
divided by the total number of reads per sample; S, counts are scaled to the total sum of gene x length products or quotients per sample. Column 4 indicates
whether individual transcript reads are divided by the total number of mappable reads before entering the equation (N, no; Y, yes), that is, whether reads
per transcript are considered as a fraction of all reads or not. A full description is given in the Methods section. The labels 1, 2, 3 and 4 define the four clusters;
samples that belong to the same cluster receive the same label. Method 3 is analogous to the RPKM method. Only method 16 (bold) leads to the correct
clustering. Methods 9 and 17 (italics) give the same clustering results (only the numbering of the clusters is changed) and perform slightly worse than
method 16, with one falsely grouped sample (T3B).

Because in RNA-seq even a gene with a single count among multiple subjects will
be listed as 'expressed' in the sequencer output, very low-expressing genes
need to be excluded by a threshold criterion to limit experimental
noise. We thus only considered genes to be expressed at analyzable
levels if their raw counts were greater than zero in at least 95% of subjects,
that is, zero in no more than three of our 79 samples. For our test data set,
which was used for the evaluation of different normalization strategies,
this meant that all genes without mappable reads in any one of the eight
samples were excluded (since one out of eight would have amounted to
12.5% zero reads).
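
The threshold criterion described above can be sketched in a few lines of Python. This is an illustration only, not the code used in the study; the function names and the dictionary layout (gene name mapped to a list of per-sample raw counts) are assumptions made for the example.

```python
def passes_expression_threshold(raw_counts, min_fraction=0.95):
    """Return True if a gene has a nonzero raw count in at least
    `min_fraction` of samples (hypothetical helper, not from the paper)."""
    if not raw_counts:
        return False
    nonzero = sum(1 for count in raw_counts if count > 0)
    return nonzero / len(raw_counts) >= min_fraction


def filter_genes(counts_by_gene, min_fraction=0.95):
    """Keep only genes that pass the threshold.

    counts_by_gene: dict mapping gene name -> list of raw counts,
    one entry per sample (assumed layout).
    """
    return {
        gene: counts
        for gene, counts in counts_by_gene.items()
        if passes_expression_threshold(counts, min_fraction)
    }
```

With 79 samples a gene with three zero counts passes (76/79 is about 96%) and a gene with four zero counts fails (75/79 is about 94.9%); with eight samples a single zero count (12.5%) is already enough to exclude the gene, as stated in the text.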

The average number of total mapped reads per subject was 16 357 257. 
The average fraction of uniquely mapped reads was 45%. For multiple 
transcripts at a given genomic location, for example, owing to the 
presence of splice variants, only one transcript with the highest number of 
counts was included in the analyses. A total of 15 761 of the resulting 22 
075 transcripts failed our threshold criterion of expression above 
background; the remaining 6314 transcripts/genes were analyzed with 
regard to disease-specific expression profiles. Our test data set, used for 
the evaluation of different normalization strategies, contained 9858 
transcripts. The reason that our test data set contained a third more 
transcripts than the analysis data set lies with our exclusion of genes 
expressing at background, which led to a higher number of genes being 
dropped from the analysis data set. 

For the evaluation of different normalization approaches, we compared 
a panel of 17 different methods plus raw (non-normalized) data. 
Normalization methods differed by their use of exon vs transcript length 
data and different scaling approaches (Table 1; also see Supplementary 
Methods for a comprehensive mathematical description). Each of these 18 
approaches was applied to our test data set consisting of technical 
replicates from four subjects (T1-T4). We employed k-means clustering in 
an attempt to recover the four natural clusters in which technical replicates 
are paired, and evaluated methods by how well they recovered the natural 
clusters. 
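
The evaluation criterion (technical replicates paired, subjects kept distinct) can be expressed as a simple check on the cluster labels returned by any k-means implementation. The sketch below is an assumption about how such a check might look; the function name and the sample naming scheme (subject ID followed by a replicate letter, for example T1A) are chosen for illustration.

```python
def replicates_correctly_clustered(labels):
    """Check whether cluster labels recover the natural clusters.

    labels: dict mapping a sample name such as 'T1A' to a cluster label.
    The last character of the name is assumed to be the replicate letter.
    """
    labels_by_subject = {}
    for sample, label in labels.items():
        subject = sample[:-1]
        labels_by_subject.setdefault(subject, set()).add(label)

    # Every subject's replicates must share exactly one cluster label.
    if any(len(found) != 1 for found in labels_by_subject.values()):
        return False

    # Different subjects must not share a cluster label.
    subject_labels = [next(iter(found)) for found in labels_by_subject.values()]
    return len(set(subject_labels)) == len(subject_labels)
```

Because only co-membership matters, a result that differs from another solely in the numbering of the clusters is judged identical by this check.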

To account for the fact that genes act cooperatively in biological 
systems, we developed a new analysis approach for comparison of 
transcriptome profiles which is based on the identification of genes which 
are, given the presence of all other genes, significantly involved in shaping 
a specific gene expression profile. This regression-based analysis approach, 
identified by the acronym SIcall (for 'significantly involved calls') is 
described in detail in the Supplementary Methods. To infer microRNA 
(miRNA) involvement from groups of significantly involved genes, we used 
TargetScan (Release 6.3, June 2012). Genotype by target gene expression 
interactions were modeled using two-way analysis of variance (ANOVA) 
models (see Supplementary Methods for details). 



RESULTS 

Superior performance of nonstandard normalization methods in 
postmortem human brain 

LCM combined with aRNA amplification from postmortem human 
brain is a powerful technique to obtain cell population specific 
transcriptome data, but poses technical challenges. RNA from 
postmortem human brain is subject to degradation owing to 
agonal factors and a postmortem interval between death and the 
preservation of tissue. This problem is compounded by LCM, 
during which some amount of RNA degradation inevitably occurs 
even with stringent RNase-free technique. In addition, aRNA
production leads to shortening of transcripts over successive
cycles of amplification.

In the most widely used normalization strategy for RNA-seq 
experiments, the RPKM method (reads per kilobase of exon model 
per million mapped reads), dividing raw counts by exon length 
reduces the bias that is introduced by the fact that longer genes 
accumulate more counts. In our sample of 96 subjects, average
mapped reads (raw counts) were more strongly proportional to 
total transcript length (r = 0.427, P<10~'^) than to cumulative 
exon length, however (r= 0.080, P<10~^) (see also the 
Supplementary Figure). 
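
For reference, the standard RPKM calculation named above can be written as follows. This is the generic textbook definition, shown as a sketch; it is not the study's own implementation, and the function and parameter names are assumptions.

```python
def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads.

    read_count:          raw reads mapped to the gene
    exon_length_bp:      cumulative exon length of the gene in base pairs
    total_mapped_reads:  total mapped reads in the sample
    """
    kilobases = exon_length_bp / 1_000
    millions_of_reads = total_mapped_reads / 1_000_000
    return read_count / (kilobases * millions_of_reads)
```

For two genes with the same raw count in the same sample, the gene with twice the exon length receives half the RPKM value, which is how the method compensates for longer genes accumulating more counts.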

In principle, the shortening of measurable transcripts occurring
as a result of partial mRNA degradation and aRNA amplification
under our experimental conditions disproportionately affects shorter
genes. For example, a loss of 500 bp will remove 50% of the signal
from a 1-kb transcript, but only 25% of the signal from a 2-kb mRNA.
As a result, in our experiment shorter genes had 'noisier' levels of
expression, as indicated by higher coefficients of variation of the
raw mapped counts. This inversely proportional relationship was 
stronger for total transcript length (r=-0.104, P< 10""^) than for 
exon length (r = - 0.063, P< 10""^). 
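
The arithmetic behind the 500-bp example, and the noise measure referred to above, can be sketched as follows. Both helpers are hypothetical and serve only to make the reasoning concrete; the coefficient of variation is computed here as sample standard deviation divided by mean, which is an assumption about the exact estimator used.

```python
import statistics


def fraction_of_signal_lost(transcript_length_bp, lost_bp):
    """Fraction of a transcript's measurable length removed by a fixed loss."""
    lost = min(lost_bp, transcript_length_bp)
    return lost / transcript_length_bp


def coefficient_of_variation(counts):
    """Sample standard deviation of the counts divided by their mean."""
    return statistics.stdev(counts) / statistics.mean(counts)
```

A loss of 500 bp gives 0.5 for a 1000-bp transcript and 0.25 for a 2000-bp transcript, matching the example in the text.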

To determine which normalization method would
perform best under our experimental conditions, we designed and
tested 17 different methods, plus no normalization. K-means
clustering was used to compare the quality of different 
approaches, based on the assumption that the best performing 
method(s) would appropriately cluster technical replicates 
together and samples from the four different subjects as distinct. 
Only one method (#16) led to the correct clustering; two other
methods produced results that were identical to each other, with
one sample placed in the wrong cluster (#9 and #17, Table 1). The
top-performing method filters out a portion of the experimental 
noise by introducing a stronger bias against noisier, shorter 
transcripts, whereas the two runners-up reduce the inherent 
sequencing bias against short transcripts in a way that is similar to 
the RPKM method. As the top-performing method, noise 
reduction scaling (#16) became the basis for our subsequent 
analyses. Length scaling (#9) was used to compare the effects of
different normalization strategies, and to confirm results. 



Identification of transcriptome differences between subject 
groups 

We made the following seven comparisons: (1) all mental illness
(n = 50) vs nonpsychiatric controls (n = 29), (2) schizophrenia 
(n = 17) vs nonpsychiatric controls (n = 29), (3) bipolar disorder 
(n = 16) vs nonpsychiatric controls (n = 29), (4) major depression 
(n = 17) vs nonpsychiatric controls (n = 29), (5) schizophrenia 
(n = 17) vs bipolar disorder (n = 16), (6) schizophrenia (n = 17) vs 
major depression (n = 17), (7) bipolar disorder (n = 16) vs major 
depression (n = 17). Cumulatively these comparisons identified 
141 genes as likely to be involved in shaping DG expression 
profiles in mental illness (Supplementary Table 1). In contrast to the
'fold change' output of traditional regression methods, the
weights given by SIcall analysis are a rough measure of the 
likelihood of a gene contributing to overall DG gene expression 
changes in mental illness, given the presence of all other genes 
represented in the transcriptome. 



Influence of the normalization and analysis method on gene 
identification 

Using an alternative normalization method (length scaling instead
of noise reduction scaling) led to the identification of 162 genes.
Only 64 of these genes, in particular the most heavily weighted 
ones, were identified using both scaling methods. Hence, 
identification of the majority of genes was strongly dependent 
on the choice of normalization method. For the 64 genes 
identified in both versions of the scaled data, there was fairly 
good overlap in the pattern of weights generated by our different 
comparisons (Supplementary Table 2). 

The dependence of gene identification on normalization 
method was independent of the type of analysis method used. 
Using univariate logistic regression models led to the identifica- 
tion of 65 genes as differentially expressed (any of seven group 
comparisons, P< 0.01) using length scaled data, and 80 genes as 
differentially expressed (any of seven group comparisons, 
P<0.01) using noise reduction scaling, with an overlap of 26 
genes between the two normalization methods (data not shown). 
Hence, for either analysis method (SIcall vs a more traditional 
approach) only about a third of genes were reproducibly 
identified across different normalization methods. 
Table 2. Targeting of SIcall and control gene sets by microRNA
(miRNA)

miRNA/miRNA family | % of genes targeted | Set 2 vs NC

Listed are all miRNAs/miRNA families targeting at least five (25%) of Set 1
genes. The % of target genes among Set 1 genes (n = 20), Set 2 genes
(n = 117) and non-called genes (NC, transcripts expressing above background
but not identified by significantly involved calls (SIcall), n = 6055)
are given. For each miRNA, the numbers of target genes vs nontarget
genes in Set 2 vs NC genes were compared using χ² tests, with P-values
shown in the rightmost column.



Deducing miRNA involvement from transcriptome data
We next investigated the possibility that our observed DG 
transcriptome changes in mental illness could be the result of 
dysregulated miRNA signaling. The deduction of miRNA involve- 
ment from mRNA gene expression profiles relies on computa- 
tional approaches that match miRNAs to target genes by 
searching the 3'UTR of potential miRNA target genes for 6-8 bp 
miRNA binding sites. Yet only a subset of miRNA binding sites and 
target genes identified by purely computational approaches is 
biologically relevant. 

To address this problem we applied a two-step approach in 
which we used our most heavily weighted genes (Set 1, weights of 
4 or greater, n = 21) to discover possible miRNA involvement, 
using TargetScan. Genes with a total comparison weight 1-3 were 
assigned to Set 2 (n = 117) (Supplementary Table 1). We 
hypothesized that if DG transcriptome changes in psychiatric 
conditions result at least to some extent from dysregulation of 
signaling by a miRNA, both heavily weighted (Set 1) and more 
lightly weighted genes (Set 2) should have overrepresentation of 
target genes for this particular miRNA compared with the 
remainder of transcripts expressing above background but not 
identified by SIcall ('non-called genes', NC, n = 6055). We further 
hypothesized that targeting by this miRNA would be strongest in 
Set 1 genes, followed by Set 2 genes, followed by NC genes. 

Twenty miRNAs or miRNA families targeted at least 25% (≥5) of
Set 1 genes (Supplementary Table 3). Among these, two followed
the hypothesized pattern of a drop in targeting rates from Set 1
over Set 2 to NC genes, with statistically significant differences in
the number of targeted genes between Set 2 and NC genes:
miR-182 and the miR-30abcdef/30abe-5p/384-5p family. After
Bonferroni correction for the number of tests performed, only
miR-182 remained statistically significant (Table 2).

A higher proportion of miR-182 target genes among genes
identified by SIcall compared with NC genes was also observed if
length scaling was used for normalization. A total of 27 of 158
genes identified by SIcall (any weight) in our method 9 normalized
data set were miR-182 targets, compared with 636 targets among
the corresponding 6035 NC genes (χ² = 6.9, P = 0.009).
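
The comparison above (27 targets among 158 SIcall genes versus 636 targets among 6035 NC genes) can be checked with a Pearson chi-square test on the 2 x 2 table. The sketch below assumes a test without continuity correction, which is our assumption rather than something stated in the text; under that assumption the statistic comes out close to the reported value of 6.9.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
        [[a, b],
         [c, d]]
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper-tail probability of a chi-square
    # value x equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: SIcall genes, NC genes. Columns: miR-182 targets, non-targets.
statistic, p_value = chi_square_2x2(27, 158 - 27, 636, 6035 - 636)
```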

Validation of miR-182 involvement in shaping DG granule cell
transcriptomes

Saus et al have shown that a C to T substitution in the single-
nucleotide polymorphism rs76481776 leads to overexpression of
miR-182 in T- vs C-allele carriers and causes a significant reduction
in target gene expression. The minor allele frequency of
rs76481776 in our subjects was 8.9%, that is, 13 of our 79 subjects
were T-allele carriers, one of them a T/T homozygote; the
remaining 66 individuals had the C/C genotype. Our minor allele
(T) frequency of 8.9% was in good agreement with the previously
reported 7.5% in Spanish subjects. Genotypes were in Hardy-
Weinberg equilibrium (not shown).
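
The genotype counts implied by the text (66 C/C, 12 C/T and 1 T/T) are enough to recompute the minor allele frequency and a goodness-of-fit statistic for Hardy-Weinberg equilibrium. The following is a minimal sketch with an assumed function name; with an expected T/T count below one the chi-square approximation is crude, so it should be read as a plausibility check rather than as the analysis performed in the study.

```python
def hardy_weinberg_chi_square(n_cc, n_ct, n_tt):
    """Return (minor allele frequency, chi-square statistic) for a
    biallelic marker, comparing observed genotype counts with the
    counts expected under Hardy-Weinberg equilibrium."""
    n = n_cc + n_ct + n_tt
    q = (n_ct + 2 * n_tt) / (2 * n)  # frequency of the minor (T) allele
    p = 1 - q                        # frequency of the major (C) allele
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_cc, n_ct, n_tt)
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return q, statistic


maf, hwe_statistic = hardy_weinberg_chi_square(66, 12, 1)
```

The 14 T alleles among 158 chromosomes give the 8.9% reported in the text, and the statistic stays well below 3.84, the 5% critical value for one degree of freedom.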

We hypothesized that whenever miR-182 was active in shaping
DG gene expression profiles, we would be able to observe a
statistically significant difference in target gene expression
between T-allele carriers (C/T or T/T genotype) and those with
the C/C genotype. On the other hand, no difference in miR-182
target gene expression between rs76481776 T-allele carriers vs
noncarriers would indicate that miR-182 is not actively involved in
regulating the transcriptome. To use the functional variant
rs76481776 as a detector of miR-182 action on the transcriptome,
we compared the expression levels of miR-182 target genes
between carriers and noncarriers of the uncommon T-variant in
each of our three disease groups (schizophrenia, bipolar disorder,
major depression) and in nonpsychiatric controls.

The differences in the mean miR-182 target expression levels
between carriers and noncarriers of the rs76481776 T-allele varied
as a function of the psychiatric diagnosis (F = 13.10, P< 10""^). We
observed significant differences in mean miR-182 target expression
levels between carriers and noncarriers of the T-allele in
nonpsychiatric controls (t = 4.77, P<10~^) and in subjects with
bipolar disorder (t = -3.48, P<10~'^). By contrast, target gene
expression levels did not differ significantly by genotype group in
individuals with schizophrenia (t = -1.61, P = 0.108) or major
depression (t = 0.88, P = 0.380). Hence, although miR-182 targeting
is active in DG granule cells of control subjects and individuals
with bipolar disorder, it appears to be lost in subjects with
schizophrenia and major depression. Using our alternative
normalization method (length scaled data) we could confirm loss
of miR-182 signaling in subjects with depression, but not in
schizophrenia (data not shown).



DISCUSSION 

This study is, to the best of our knowledge, the first application of 
RNA-seq to cells isolated from postmortem human brain by LCM. 
Our whole transcriptome analysis approach, SIcall, will be a useful 
addition to the tool chest of other currently available analysis 
methods. To our knowledge, our study is also the first to use 
genetic methods for validation of miRNA gene targeting in global 
transcriptomes. 

Gene expression studies in humans are made difficult by the 
marked heterogeneity of subjects, which strongly reduces 
statistical power. We therefore tried to maximize our power to
detect differences between groups by collecting a larger subject 
group, using samples from three different brain banks. None- 
theless, variability of subject and sample characteristics creates 
important confounders in transcriptome comparisons. Previous 
studies have indicated that RNA quality, brain pH, postmortem
interval, subject gender, ethnicity, age, disease duration, drug
treatment history, suicide status, and alcohol and substance abuse
comorbidity can affect results. Among these, RNA integrity and
the factors directly affecting it, such as postmortem interval and
brain pH, have by far the strongest impact. Analysis of
postmortem human brain has shown that a longer postmortem
interval and lower tissue pH lead to decreased RNA integrity,
which introduces noise into gene expression data. This is
further exacerbated by a previously demonstrated 30% drop in
RNA integrity during LCM. Other covariates may create false
positive reports of differential gene expression or mask true
gene expression differences. For example, a previous study of
SMRI samples has shown that higher cumulative lifetime
antipsychotic dose probably normalizes some of the inherent
molecular changes of schizophrenia. Additional difficulty is
created by the fact that not all relevant subject information might 
be known. For example we did not have access to family 
psychiatric history or lifetime exposure to psychotropic drugs for 
subjects from the UCLA and UW brain banks. We investigated the 
influence of subject age, our potentially most relevant confounder, 
on transcriptome differences, and found that it did not affect our 
results. Given the large number of potentially confounding subject 
variables, however, we cannot exclude the possibility that other 
confounders might have influenced our observed gene expression 
patterns. 

The statistical analysis of transcriptome data traditionally relies 
on separate comparison of expression levels for each gene 
between case and control conditions, resulting in a report of fold 
changes for each gene. To explore the cooperative action of 
groups of genes, we used an alternative regression-based analysis 
approach (SIcall) looking at the simultaneous actions of up to five 
genes. Although models with higher numbers of participating 
genes are possible, their computational cost is prohibitive. For 
each model, our algorithm generates a large number of logistic
regressions representing the many ways in which small groups of
genes can cooperatively characterize transcriptome differences
between two groups. Similar approaches have previously been
used in the analysis of gene expression data. For each
comparison we set the threshold of the probability at which a 
gene would be considered involved in shaping gene expression 
profiles to 0.05. In other words, if a gene had at least 5% 
probability of being featured in one of five sets of logistic 
regression models allowing for the simultaneous action of either 1, 
2, 3, 4 or 5 genes at a time, it was listed as significantly involved 
and entered our subsequent analysis steps. It should be noted that 
this 5% represents an empirically chosen probability threshold 
which does not correspond to statistical significance. Genes were 
weighted by the number of times they were called per 
comparison (up to five), and the total weights across all seven 
comparisons (up to a theoretical maximum of 35). It should be 
noted that total gene weights are not quantitative in the way 
gene expression changes are, but they rather represent rough 
measures of the likelihood of the involvement of a given gene. 
Only the results of the simplest SIcall models, such as those which 
investigate one gene at a time, roughly correspond to the 
traditional idea of gene-by-gene differential expression. 
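
The weighting scheme described above can be illustrated with a short sketch. It covers only the final counting step, not the model search itself (which is described in the Supplementary Methods); the nested dictionary layout, the function names and the gene and comparison labels in the usage example are all hypothetical.

```python
def sicall_weights(inclusion_prob, threshold=0.05):
    """Count how often each gene is called.

    inclusion_prob: dict of the assumed form
        {comparison: {model_size: {gene: probability}}}
    where model_size runs from 1 to 5. A gene is called once for every
    model size in which its probability reaches the threshold, so the
    weight per comparison is at most five.
    Returns {gene: {comparison: weight}}.
    """
    weights = {}
    for comparison, by_size in inclusion_prob.items():
        for by_gene in by_size.values():
            for gene, probability in by_gene.items():
                if probability >= threshold:
                    per_gene = weights.setdefault(gene, {})
                    per_gene[comparison] = per_gene.get(comparison, 0) + 1
    return weights


def total_weight(weights, gene):
    """Sum of a gene's weights across all comparisons."""
    return sum(weights.get(gene, {}).values())


# Hypothetical input: one gene, two comparisons.
example = {
    "comparison_A": {1: {"GENE1": 0.10}, 2: {"GENE1": 0.06},
                     3: {"GENE1": 0.01}, 4: {"GENE1": 0.00},
                     5: {"GENE1": 0.05}},
    "comparison_B": {1: {"GENE1": 0.20}},
}
example_weights = sicall_weights(example)
```

With seven comparisons and five model sizes per comparison, the largest total a gene can reach under this counting rule is 35, as noted in the text.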

Including individual disease vs disease comparisons (e.g. 
schizophrenia vs major depression) in our analysis was based on 
the hypothesis that any gene significantly involved in shaping a 
disease-specific transcriptome might also reveal itself in compar- 
ison of this disease with any other psychiatric condition. For 
example, the gene C9orf102 is heavily weighted in both the 
bipolar disorder vs control, and the bipolar disorder vs depression 
comparisons. We can hypothesize from this that C9orf102
expression might potentially be useful as a biomarker differentiat- 
ing bipolar disorder from major depression, warranting further 
experimental exploration and confirmation. The genes OPTN,
FAM114A, OXSR1, RLF and TLL1 are heavily weighted in the
comparison of schizophrenia against major depression, but are
not called in any other comparison. Our analysis does not reveal 
which of these genes might be involved in schizophrenia, which in 
depression, or which in both, the latter as a result of opposing 
gene expression changes in the two conditions. Nonetheless, the 
fact that they are called by our analysis indicates that investigating 
them further might yield insights into broad processes which may 
be dysregulated in major depression or schizophrenia. Our 
inclusion of an 'all disease' vs nonpsychiatric control comparison 
was motivated by the hypothesis that major psychiatric conditions 
might share subtle transcriptome changes that are detectable only 
if larger groups are compared. However, contrary to this 
expectation, there were relatively few genes that had weights 
>1 and were called only in the all disease vs control, but not in any 
other comparison. 

Prior gene expression studies in postmortem human brain have 
successfully implicated broad systems dysfunction in mental 
illness. Most of these studies have used microarray technology,
but RNA-seq has been employed in more recent work.
The vast majority of prior studies investigated tissue blocks as 
opposed to near-homogeneous cell populations isolated by LCM. 
One previous study exists in which DG gene expression profiles 
were compared in subjects with schizophrenia, bipolar disorder, 
major depression and nonpsychiatric controls, using LCM and 
microarrays. The authors found decreased expression of genes
related to protein turnover, energy metabolism and neuronal 
functions in subjects with schizophrenia compared with controls. 
No significant transcriptome changes were observed in subjects 
with major depression or bipolar disorder. 

Although prior gene expression studies have been consistent in 
reporting systems-level dysfunction in the brains of subjects with 
mental illness, for example, inflammation, observations of 
differential expression for individual genes have been far less 
reproducible. In a meta-analysis of 12 genome-wide expression 
studies in postmortem brain of subjects with bipolar disorder 
compared with controls, Elashoff et al. have shown that the 
likelihood of a gene reported as differentially expressed in one 
study having a repeat finding in one of 11 other studies was only 
9%. This lack of robustness in gene expression findings has 
previously been attributed to interacting factors such as tissue pH 
and subject age or gender.^^ Our findings indicate that the choice 
of suboptimal normalization methods may be an additional 
contributing factor. Under our experimental conditions the 
generally accepted standard of transcriptome normalization, the 
RPKM method, showed inferior performance compared with the 
other approaches. 
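
For reference, RPKM (reads per kilobase of transcript per million mapped reads) scales each gene's read count by both gene length and library size. The following is a minimal sketch of that standard definition only; the function name and the toy counts and lengths are illustrative assumptions and are not taken from the pipeline used in this study.

```python
def rpkm(counts, lengths_bp):
    """Return RPKM per gene.

    counts     -- dict mapping gene name to mapped read count (toy data here)
    lengths_bp -- dict mapping gene name to gene/exon model length in base pairs
    """
    total_reads = sum(counts.values())
    if total_reads == 0:
        raise ValueError("library contains no mapped reads")
    # reads * 1e9 / (length_bp * total_reads) is equivalent to
    # reads / (length_kb * total_reads_in_millions)
    return {
        gene: reads * 1e9 / (lengths_bp[gene] * total_reads)
        for gene, reads in counts.items()
    }


# Hypothetical two-gene library: geneB is three times longer and has
# three times as many reads, so both genes receive the same RPKM.
counts = {"geneA": 500, "geneB": 1500}
lengths_bp = {"geneA": 1000, "geneB": 3000}
values = rpkm(counts, lengths_bp)
print(values)
```

Because the length term sits in the denominator, any error or variability in the assumed transcript length propagates directly into the normalized value, which may be one reason length-based scaling behaves differently on low-input, amplified material.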

We hypothesized that our observed gene expression changes 
might have occurred as the result of an overarching dysregulation 
of gene expression in mental illness. We chose to look at 
posttranscriptional regulation by miRNAs because of strong 
evidence of their involvement in DG neurogenesis and major 
psychiatric disorders. The miRNAs are a class of small noncoding 
RNAs which inhibit expression of groups of target genes through 
degradation or translational repression of their mRNAs. Over half of miRNAs are 
highly or exclusively expressed in brain, where they participate in 
neurogenesis and neuronal plasticity.^^"^^ Changes in miRNA 
expression in postmortem human brain have previously been 
shown in schizophrenia,^^"^"^ bipolar disorder^^'^^ and depression.^^ 

Contrasting with our approach, prior studies have relied on the 
direct profiling of miRNA expression using a preselected panel of 
miRNAs. Accurate direct profiling of miRNAs in postmortem 
human brain could be compromised, however, by potentially 
limited miRNA stability in neuronal cells. Although studies in 
embryonic cell lines and peripheral organs such as liver and heart 
have shown that miRNAs can be highly stable molecules with half-lives 
of up to several days,^^"^^ Krol et al., comparing miRNA decay in 
neurons with that in non-neuronal cells, found miRNA 
decay in neurons to be activity-dependent, and occurring much 
faster than in non-neuronal cells. Their observation of miRNA half-lives 
of less than 1 h agrees with other findings in human primary 
neuronal cells and short postmortem interval human neocortex, 
where Sethi and Lukiw^° reported miRNA half-lives ranging from 
1 h to about 3.5 h. The latter would mean that during a 
postmortem interval of 24 h, which is the case for many human 
subjects in publicly available brain collections, more than 99% of 
miRNA molecules have degraded.^° Our ability to directly detect 
changes in miRNA expression in our subjects was further 
compromised by our use of aRNA amplification. Although miRNAs 
are processed from polyadenylated primary transcripts, the poly-A tail is 
lost in mature miRNAs. Hence the bulk of mature miRNAs was lost 
during aRNA amplification before the preparation of sequencing 
libraries. Possibly as a result, we were not able to detect 
expression of miR-182 or any member of the miR-30 family in our 
subjects. 
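
The degradation estimate given earlier in this paragraph (half-lives of 1 h to about 3.5 h, postmortem interval of 24 h) can be checked with a short calculation. This sketch assumes simple first-order (exponential) decay with a constant half-life, which is an illustrative simplification and not a model fitted in this study; the function name is hypothetical.

```python
def fraction_remaining(elapsed_h, half_life_h):
    """Fraction of molecules left after elapsed_h hours,
    assuming first-order decay with a constant half-life."""
    return 0.5 ** (elapsed_h / half_life_h)


postmortem_interval_h = 24.0
# Half-life range quoted in the text: 1 h to about 3.5 h.
for half_life_h in (1.0, 3.5):
    degraded = 1.0 - fraction_remaining(postmortem_interval_h, half_life_h)
    print(f"half-life {half_life_h} h: {degraded:.4%} degraded after 24 h")
```

Even at the slow end of the quoted range (3.5 h), 24 h corresponds to almost seven half-lives, so under this assumption less than 1% of the initial miRNA pool would remain.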

We believe our study is the first to show evidence of disrupted 
miR-182 signaling in schizophrenia and major depression. 
Members of the miR-30 family, however, have been repeatedly 
shown to have decreased expression in schizophrenia.^^ It should 
be noted that only a minority of genes identified in our RNA-seq 
analysis were actually miR-182 targets. Hence, it is clear that we 
were able to discover only one of possibly a multitude of 
regulatory mechanisms accountable for shaping DG transcriptome 
changes in mental illness. 

miR-182 is part of a cluster of three miRNAs, miR-96, miR-182 and 
miR-183, which lie within a 4 kb genomic segment 
at 7q32.2.^^'^^ miR-182 is involved in a broad range of 
biological processes including regulation of the immune response, 
DNA repair, cell proliferation and differentiation, and regeneration 
of peripheral nerves after injury.^"^"^^ miR-182 is the most highly 
expressed miRNA in the pineal gland, where it accounts for 28% 
of the miRNA population. There, miR-182 expression follows a rhythmic 
pattern, with approximately two-fold differences between 
the highest levels, from 6 am to 12 pm, 
and the lowest levels during the night.^° In keeping with this, 
Saus et al. found an association between the rs76481776 
polymorphism of the miR-182 gene and patterns of insomnia in 
patients with major depression. Disruptions of sleep and circadian 
rhythmicity are cardinal features of both schizophrenia and major 
depression, which aligns with our findings of a possible loss of 
miR-182 signaling in these conditions.^^"^^ 



CONCLUSIONS 

Whole transcriptome analysis by RNA-seq in LCM-isolated DG 
granule cells of postmortem hippocampus in subjects with mental 
illness and controls showed evidence of disrupted miR-182 
signaling in subjects with major depression and schizophrenia. 
We validated this finding by showing how the impact of a 
functional miR-182 single-nucleotide polymorphism on target 
gene expression was lost in subjects with schizophrenia and major 
depression. Our paper is the first study in which LCM is combined 
with RNA-seq in postmortem human brain. Under these challenging 
experimental conditions, a noise reduction scaling normalization 
method outperformed normalization by exon or transcript 
length. We also demonstrate the feasibility of a novel, regression-based 
method for RNA-seq analysis (SIcall) which allows for the 
investigation of cooperative action among small sets of genes and 
is a useful complement to existing approaches. 